Faster, Practical GLL Parsing

نویسندگان

  • Ali Afroozeh
  • Anastasia Izmaylova
چکیده

Generalized LL (GLL) parsing is an extension of recursivedescent (RD) parsing that supports all context-free grammars in cubic time and space. GLL parsers have the direct relationship with the grammar that RD parsers have, and therefore, compared to GLR, are easier to understand, debug, and extend. This makes GLL parsing attractive for parsing programming languages. In this paper we propose a more e cient Graph-Structured Stack (GSS) for GLL parsing that leads to significant performance improvement. We also discuss a number of optimizations that further improve the performance of GLL. Finally, for practical scannerless parsing of programming languages, we show how common lexical disambiguation filters can be integrated in GLL parsing. Our new formulation of GLL parsing is implemented as part of the Iguana parsing framework. We evaluate the e↵ectiveness of our approach using a highly-ambiguous grammar and grammars of real programming languages. Our results, compared to the original GLL, show a speedup factor of 10 on the highly-ambiguous grammar, and a speedup factor of 1.5, 1.7, and 5.2 on the grammars of Java, C#, and OCaml, respectively.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Island Grammar-Based Parsing Using GLL and Tom

Extending a language by embedding within it another language presents significant parsing challenges, especially if the embedding is recursive. The composite grammar is likely to be nondeterministic as a result of tokens that are valid in both the host and the embedded language. In this paper we examine the challenges of embedding the Tom language into a variety of general-purpose high level la...

متن کامل

GLL Parsing

Recursive Descent (RD) parsers are popular because their control flow follows the structure of the grammar and hence they are easy to write and to debug. However, the class of grammars which admit RD parsers is very limited. Backtracking techniques may be used to extend this class, but can have explosive runtimes and cannot deal with grammars with left recursion. Tomita-style RNGLR parsers are ...

متن کامل

2D Trie for Fast Parsing

In practical applications, decoding speed is very important. Modern structured learning technique adopts template based method to extract millions of features. Complicated templates bring about abundant features which lead to higher accuracy but more feature extraction time. We propose Two Dimensional Trie (2D Trie), a novel efficient feature indexing structure which takes advantage of relation...

متن کامل

EPSILON FACTOR FOR GLl × GLl ′; l ̸ = l ′ PRIMES

Let F be a non-Archimedean local field with finite residual field of characteristic p. In this article we calculate the ε-factor of pairs for GLl(F ) ×GLl′(F ) where l and l are distinct primes including the case l = p. For this calculation, we use the local Langlands correspondence and non-Galois base change lift. This method leads to the explicit conjecture of the ε-factor of the representati...

متن کامل

ON SOME CONSTANTS IN THE SUPERCUSPIDAL CHARACTERS OF GLl, l A PRIME = p

The article gives explicit values of some constants which appear in the character formula for the irreducible supercuspidal representation of GLl(F ) for F a local field of the residual characteristic p = l.

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015